Method of symbol detection for MIMO dual-signaling uplink CDMA systems

ABSTRACT

The invention provides a block-wise successive interference cancellation (SIC) detection algorithm for a general multi-input multi-output (MIMO) CDMA system over frequency-selective channels, in which each user's data stream can be either orthogonal space-time block encoded for transmit diversity or spatially multiplexed for high spectral efficiency, according to the channel conditions. In such a dual-signaling system, the receiver could suffer from large-dimension data processing. A two-stage approach to the block-wise SIC detection algorithm is therefore proposed to further reduce the computational complexity.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority under 35 U.S.C. §119 of Taiwanese Patent Application No. 094115211, dated May 11, 2005, the disclosure of which is hereby incorporated by reference.

FIELD OF THE INVENTION

This invention relates to communication systems and, more particularly, to multi-input multi-output (MIMO) communication systems.

BACKGROUND OF THE INVENTION

Several prior techniques have been developed for MIMO communication systems, including:

(1) In MIMO systems, it is known that high spectral efficiency and high link quality can be achieved by exploiting the spatial multiplexing (SM) scheme [G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W. Wolniansky, “Detection algorithm and initial laboratory results using V-BLAST space-time communication structure,” Electronics Letters, vol. 35, no. 1, pp. 14-16, January 1999] (hereafter referred to as REF. 1) and the space-time coding (STC) scheme [V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block codes from orthogonal designs,” IEEE Trans. Inform. Theory, vol. 45, no. 7, pp. 1456-1467, July 1999] (hereafter referred to as REF. 2), respectively. Such schemes can be directly applied to multiuser (MU) systems, yielding an MU SM system or an MU STC system. However, in either system the data streams of all users must be transmitted in the same mode and cannot be switched. This is inflexible and cannot achieve the best performance over general link requirements and/or a wide range of channel conditions.

(2) Naguib's two-step method [A. F. Naguib, N. Seshadri, and A. R. Calderbank, “Applications of space-time block codes and interference suppression for high capacity and high data rate wireless systems,” Proc. 32nd Asilomar Conf. Signals, Systems, and Computers, vol. 2, pp. 1803-1810, 1998] (hereafter referred to as REF. 3) can be directly implemented in an MU STBC system. In this scenario, the overall detection framework can be regarded as a parallel interference cancellation (PIC) scheme followed by a local ML search. With such processing, the signal detection cannot enjoy increased receive diversity gain through the PIC step. Moreover, this method relies on the ML metric to decide the optimal detection order, which may achieve better detection performance but incurs a large computational cost.

(3) The method proposed in [V. Tarokh, A. Naguib, N. Seshadri, and A. R. Calderbank, “Combined array processing and space-time coding,” IEEE Trans. Inform. Theory, vol. 45, no. 4, pp. 1121-1128, May 1999] (hereafter referred to as REF. 4) applies the BLAST algorithm for signal detection followed by an ML search in single-user (SU) systems. However, the algorithm is mainly based on space-time trellis codes and does not exploit the codeword's algebraic structure for decoding.

(4) Stamoulis's method [A. Stamoulis, N. Al-Dhahir, and A. R. Calderbank, “Further results on interference cancellation and space-time block codes,” Proc. 35th Asilomar Conf. Signals, Systems, and Computers, vol. 1, pp. 257-261, 2001] (hereafter referred to as REF. 5) is a pure interference cancellation scheme that uses an appropriate linear transformation, based on the algebraic structure of orthogonal space-time block coding (O-STBC), to decouple one user's data stream at a time in MU STBC systems. At each stage, no additional degrees of freedom are retained after the interference cancellation step for further interference suppression or signal detection at the next stage. Consequently, the method cannot enjoy an increase in receive diversity as the algorithm proceeds, even if it is combined with a power ordering strategy.

SUMMARY OF THE INVENTION

We propose a group SIC detection algorithm for a general MIMO CDMA system, in which each user's data stream can be either orthogonal space-time block encoded for transmit diversity or spatially multiplexed for high spectral efficiency according to the channel conditions.

Based on the rich and distinctive structures embedded in the resulting channel matrix, we derive a high-performance and computationally efficient detector. The algorithm can be described as follows:

(1) A flexible MIMO transceiver is suggested for uplink CDMA systems over the frequency-selective channels as depicted in FIG. 1.

(2) The data streams transmitted from each mobile terminal can be either spatially multiplexed (e.g., vertical Bell Laboratories layered space-time, V-BLAST) for achieving a high data rate or orthogonal space-time block encoded (e.g., orthogonal space-time block code, O-STBC) for transmit diversity.

(3) At the base station, the received data is despread, linearly combined with the channel matrix followed by an ordered successive interference cancellation (SIC) algorithm to detect the transmitted symbols from each mobile terminal.

(4) In such a dual-signaling system, the receiver could suffer from large-dimension data processing. However, by judiciously exploiting the algebraic structure of the O-STBCs, it can be shown that an attractive block-wise implementation of the SIC algorithm is possible, which brings the algorithm complexity back down.

(5) The embedded algebraic structure of the channel matrix is further exploited to develop a low-complexity recursive detector. It is shown that the weights of the V-BLAST detection at each iteration need not be computed from scratch but are obtained directly from the information of the previous iteration, without any matrix inversion.

(6) To address the time delay caused by the STBC signals, this invention proposes a two-stage group SIC detection algorithm for the dual-signaling system, which reduces the computational complexity.

(7) The proposed flexible MIMO transceiver can be applied to the B3G high-speed uplink communications.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the proposed transceiver configuration of this invention, in which a dual-signaling transmitter and a block-wise SIC detector are proposed.

FIG. 2 shows the structure of the matched filter channel matrix F of this invention.

FIG. 3 shows the implementation of the symbol detection based on an iterative block-wise SIC algorithm with low computational complexity, as proposed in this invention.

FIG. 4 shows the average bit error rate (BER) of SM transmission and of dual-signaling transmission; and

FIG. 5 shows the average bit error rates versus signal-to-noise ratio (SNR) for three different detection methods in a dual-signaling system over a Ricean multipath fading channel.

Symbol Notations:
1 de-multiplexer
2 space-time encoder
3 spread spectrum code
4 spread spectrum decoder and diversity combiner (using H_(c) for linear combining)
5 block-wise SIC detector
6 multiplexer
7 orthogonal matrix
8 implementation of low complexity detection algorithm (1^(st) iteration)
9 implementation of low complexity detection algorithm (2^(nd) iteration)
10 implementation of low complexity detection algorithm (L^(th) iteration)
M mobile station
S base station
TD transmitted multi-users' data stream
MD multi-input multi-output channel matrix H
DD detected multi-users' data stream

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A preferred embodiment of this invention is described as follows:

I •System Modeling

A. System Descriptions and Basic Assumptions

Consider a MIMO uplink CDMA system over frequency-selective multipath fading channels, as shown in FIG. 1, in which each of Q user terminals is equipped with N transmit antennas. The data stream of the q^(th) user, s_(q)(k), where 1≦q≦Q, can be transmitted using either spatial multiplexing (SM) or space-time block coding (STBC). Let S_(D) and S_(M) denote the sets of user terminals that use STBC and SM, respectively, and let Q_(D):=|S_(D)| and Q_(M):=|S_(M)| denote the corresponding numbers of users. Following REF. 2, each P consecutive symbols in the data stream of an STBC terminal are spatially and temporally encoded and then transmitted across N antennas over K symbol intervals. Over the same K symbol intervals, each SM user transmits NK independent symbols. Hence, the total number of data symbols transmitted by the Q users in K symbol intervals is: L_(T):=PQ_(D)+NKQ_(M).  (1)
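As a quick arithmetic illustration of equation (1), the following minimal sketch counts the symbols sent in one block of K symbol intervals. The function name and the example parameter values are assumptions chosen only for illustration.

```python
def total_symbols(P, N, K, Q_D, Q_M):
    """Equation (1): symbols sent by all users in K symbol intervals.

    P   : symbols per STBC codeword
    N   : transmit antennas per user
    K   : symbol intervals spanned by one codeword
    Q_D : number of STBC users, Q_M : number of SM users
    """
    return P * Q_D + N * K * Q_M


# Illustrative values only: one STBC user and two SM users, N = K = P = 2.
L_T = total_symbols(P=2, N=2, K=2, Q_D=1, Q_M=2)
print(L_T)
```

With these assumed values the single STBC user contributes 2 symbols and each SM user contributes 4, which shows how quickly SM users dominate the detection dimension.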

Concretely speaking, these two space-time signal transmission mechanisms can be completely described by the associated N×K space-time codeword matrix. In this invention, the q^(th) user's data stream s_(q)(k) is split into groups of sub-data streams as s_(q,l)(k):=s_(q)(L_(q)k+l−1), where 1≦l≦L_(q), and L_(q) is the number of sub-data streams transmitted by the q^(th) user, which depends on the signaling mode. When a user transmits using STBC, i.e., q∈S_(D), then L_(q)=P; when a user transmits using SM, i.e., q∈S_(M), then L_(q)=NK. Hence, the space-time codeword matrix of the q^(th) user can be represented as:

$\begin{matrix} {{{X_{q}(k)}\text{:} = {\sum\limits_{l = 1}^{2L_{q}}{A_{q,l}{{\overset{\sim}{s}}_{q,l}(k)}}}},} & (2) \end{matrix}$

where A_(q,l) is an N×K space-time modulation matrix. For q∈S_(D), based on REF. 2, the matrices A_(q,l) possess the following properties: (1) A_(q,k)A_(q,l)^(H)=I_(N) when k=l, and (2) A_(q,k)A_(q,l)^(H)+A_(q,l)A_(q,k)^(H)=O_(N) when k≠l. In addition, s̃_(q,l)(k):=Re{s_(q,l)(k)} when 1≦l≦L_(q), and s̃_(q,l)(k):=Im{s_(q,l−L_(q))(k)} when L_(q)+1≦l≦2L_(q). Next, the space-time coded data streams of each user are spread using the spreading code and transmitted from the N antennas through a frequency-selective fading channel with L_(c) resolvable paths.
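The role of the modulation matrices in (2) can be illustrated with the smallest real orthogonal design (N = K = P = 2). This is a minimal sketch restricted to real symbols, so only the first L_(q) terms of (2) appear; the names `A1`, `A2`, and `codeword` are hypothetical.

```python
import numpy as np

# Modulation matrices of the 2x2 real orthogonal design (assumed example).
A1 = np.eye(2)
A2 = np.array([[0.0, 1.0], [-1.0, 0.0]])


def codeword(symbols, mats):
    """Space-time codeword X = sum_l A_l * s_l for real symbols s_l."""
    return sum(s * A for s, A in zip(symbols, mats))


X = codeword([0.7, -1.2], [A1, A2])

# Property (1): A_l A_l^T = I.  Property (2): A_k A_l^T + A_l A_k^T = 0.
prop1 = np.allclose(A1 @ A1.T, np.eye(2)) and np.allclose(A2 @ A2.T, np.eye(2))
prop2 = np.allclose(A1 @ A2.T + A2 @ A1.T, np.zeros((2, 2)))
print(X, prop1, prop2)
```

The two properties together make X X^T a scalar multiple of the identity, which is the orthogonality that the later block-wise detection relies on.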

Suppose that the receiving end uses M (≧N) antennas. Let y(k)∈C^(M(G+L_(c)−1)) be the received chip-sampled space-time data vector at the k^(th) symbol period, where G is the spreading factor. Because of the time latency caused by the STBC transmission, y(k) is collected over K consecutive symbol intervals, which yields the following space-time data matrix (assuming the Q users are symbol-synchronized):

$\begin{matrix} {{Y(k)} := {\left\lbrack {y(k)}\;\ldots\;{y\left( {k + K - 1} \right)} \right\rbrack = {{\sum\limits_{q = 1}^{Q}{H_{q}{X_{q}(k)}}} + {V(k)}}}}, & (3) \end{matrix}$

where H_(q) is the M(G+L_(c)−1)×N MIMO channel matrix from the q^(th) user to the receiving end; H_(q) includes the effect of the spreading code and is assumed static over the K consecutive symbol intervals. V(k)∈C^(M(G+L_(c)−1)×K) is the channel noise matrix. The following assumptions are made:

- (A1) The symbol data streams s_(q)(k), for 1≦q≦Q, are i.i.d. with zero mean and variance σ_(s)².
- (A2) Each element of the noise V(k) is spatially and temporally white with zero mean and variance σ_(ν)².
- (A3) At least one user's data is transmitted in STBC mode; hence Q_(D)≧1.
- (A4) N≦4; hence the symbol block length is P∈{2,4}, based on REF. 2.

B. Vectorized Signal Model

To facilitate the detection process and analysis based on the matrix linear model (3), an equivalent vectorized linear model is used. Let s_(q)(k):=[s_(q,1)(k), . . . , s_(q,L_(q))(k)]^(T) be the transmitted symbol block of the q^(th) user. Without loss of generality, the NK symbols s_(q,l)(k) of each SM user (i.e., q∈S_(M)) are re-numbered so that the n^(th) group of K symbols, s_(q,l)(k) for (n−1)K+1≦l≦nK, is transmitted from the n^(th) antenna. Define s̃_(q)(k):=[Re{s_(q)^(T)(k)} Im{s_(q)^(T)(k)}]^(T)∈ℝ^(2L_(q)) and ỹ(k):=[Re{y^(T)(k)} Im{y^(T)(k)}]^(T)∈ℝ^(2M(G+L_(c)−1)) as the split real-valued symbol block of the q^(th) user and the split real-valued received signal vector, respectively. The complex-valued matrix model (3) can then be rewritten as the equivalent real-valued vector model

$\begin{matrix} {{y_{c}(k)} := {\left\lbrack {{\tilde{y}}^{T}(k)}\;\ldots\;{{\tilde{y}}^{T}\left( {k + K - 1} \right)} \right\rbrack^{T} = {{H_{c}{s_{c}(k)}} + {v_{c}(k)}}}}, & (4) \end{matrix}$

where H_(c)∈ℝ^(2KM(G+L_(c)−1)×2L_(T)) is the equivalent overall (Q-user) MIMO channel matrix,

$\begin{matrix} {{s_{c}(k)} := {\left\lbrack {{\tilde{s}}_{1}^{T}(k)}\;\ldots\;{{\tilde{s}}_{Q}^{T}(k)} \right\rbrack^{T} \in {\mathbb{R}}^{2L_{T}}}}, & (5) \end{matrix}$

is the symbol vector transmitted by all user terminals, and v_(c)(k) is the resulting noise term. Through despreading and linear combining of y_(c)(k) with the channel matrix H_(c), a matched-filtered (MF) data vector is obtained as follows:

$\begin{matrix} {{z(k)} := {{H_{c}^{T}{y_{c}(k)}} = {{F{s_{c}(k)}} + {\bar{v}(k)}}}}, & (6) \end{matrix}$

where

$\begin{matrix} {F := {{H_{c}^{T}H_{c}} \in {\mathbb{R}}^{2L_{T} \times 2L_{T}}}}, & (7) \end{matrix}$

and v̄(k):=H_(c)^(T)v_(c)(k). With these results, symbol detection can be carried out based on model (6). To make the core ideas of this invention clearer, the remainder of this description focuses on the real-valued constellation case with unit-rate codes, for which P=K so that L_(T)=PQ_(D)+PNQ_(M). Essentially the same results also hold for complex-valued constellations (possibly with half-rate codes for STBC users).
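The real-valued splitting and the matched filtering of (4) to (7) can be sketched on a generic complex linear model y = Hs + v. This is a simplification: the H_(c) of the text additionally absorbs the space-time modulation matrices, the spreading codes, and the stacking over K intervals. The helper names and the random test data are assumptions.

```python
import numpy as np


def real_split_vector(x):
    """Stack real and imaginary parts: [Re x; Im x]."""
    return np.concatenate([x.real, x.imag])


def real_split_matrix(H):
    """Real-valued equivalent of a complex matrix acting on split vectors."""
    return np.block([[H.real, -H.imag], [H.imag, H.real]])


rng = np.random.default_rng(0)
H = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
s = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = 0.1 * (rng.standard_normal(6) + 1j * rng.standard_normal(6))
y = H @ s + v

H_c = real_split_matrix(H)
s_c, v_c, y_c = (real_split_vector(x) for x in (s, v, y))

F = H_c.T @ H_c      # matched-filter channel matrix, as in (7)
z = H_c.T @ y_c      # matched-filtered data, as in (6)
v_bar = H_c.T @ v_c  # filtered noise

print(np.allclose(y_c, H_c @ s_c + v_c), np.allclose(z, F @ s_c + v_bar))
```

Working with the real-valued model keeps every later step (ordering, nulling, cancellation) in real arithmetic, at the cost of doubling the dimensions.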

II •Matched Filter Channel Matrix

As shown in this section, the matrix F has an appealing structure. To characterize it, all elements of F are first collected and then examined together. Depending on the channel conditions, the data stream of each user can be processed by STBC to obtain transmit diversity, or by SM to obtain high spectral efficiency. This leads to two signal signature prototypes, one for each signaling mode. In addition, among all interference signatures, three distinct canonical building blocks need to be identified: two reflect the “intra-class” interference between each pair of distinct SM users or distinct STBC users, and the third reflects the “inter-class” user interference.

To pin down these signature prototypes, recall that the numbers of symbols transmitted from an STBC user terminal and from an SM user terminal during K(=P) consecutive symbol intervals are P and NP, respectively. Therefore, if F_(p,q) is the sub-matrix of F representing the interference signature between the p^(th) and the q^(th) users' data streams, then F_(p,q)∈ℝ^(P×P) if p, q∈S_(D), F_(p,q)∈ℝ^(NP×NP) if p, q∈S_(M), and F_(p,q)∈ℝ^(P×NP) if p∈S_(D) and q∈S_(M). All three types of F_(p,q), together with the signal signature F_(q,q) for either q∈S_(D) or q∈S_(M), are specified as follows. In the sequel, O(P) denotes the set of all P×P real orthogonal designs with constant diagonal entries, as described in REF. 2 (sub-matrices that are scalar multiples of I_(P) fall within this category). Let F_(p,q) be the sub-matrix of F representing the mutual coupling between the p^(th) and the q^(th) users. Then the following results hold:

- (1) When p, q∈S_(D), then F_(p,q)∈O(P), and F_(q,q)=α_(qq)I_(P).
- (2) When p, q∈S_(M), then each P×P sub-matrix of F_(p,q)∈ℝ^(NP×NP) is a scalar multiple of I_(P).
- (3) When p∈S_(D) and q∈S_(M), then each P×P sub-matrix of F_(p,q)∈ℝ^(P×NP) belongs to O(P).

Some explanations and discussions of the above results follow (for the matrix structure, see FIG. 2):

(a) Property (1) covers the particular situation in which the data of all users are modulated to obtain diversity gain. Each P×P diagonal sub-matrix of F is then a scalar multiple of I_(P), and each P×P off-diagonal sub-matrix of F is an orthogonal design.

(b) For p, q∈S_(M), because SM signaling does not impose any spatial or temporal correlation among the transmitted data streams, the interference between two SM data streams transmitted from different antennas appears spatially and temporally decoupled. In particular, each such sub-matrix has constant diagonal entries, since the respective propagation channels are assumed to be static during the K signaling periods.

(c) Property (3) establishes a quite interesting result. The interference from SM data streams potentially retains, rather than destroys, the orthogonal feature of the O-STBC signals. A rough, heuristic justification is that a single-antenna SM data stream can interfere with an STBC signal only through the “time” dimension. Because the SM data stream is temporally decoupled, the incurred interference is likely to leave the embedded temporal correlation of the STBC data streams unchanged. Therefore the resultant interference signatures still appear as orthogonal-type matrices. This property also holds for single-antenna transmission in the proposed dual-signaling system since, in this case, such a user can be regarded as a single-antenna SM data stream.
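Properties (1) to (3) can be checked numerically on a toy real-valued model with N = K = P = 2. This is a minimal sketch under strong assumptions: flat real channels with 8 real observations per symbol interval stand in for the despread CDMA channel, and each column of H_c is taken as vec(H_q A_(q,l)). All function names are hypothetical.

```python
import numpy as np

P = N = K = 2
# STBC: 2x2 real orthogonal design; SM: one symbol per (antenna, time) slot.
A_STBC = [np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]])]
A_SM = [np.outer(np.eye(N)[n], np.eye(K)[t]) for n in range(N) for t in range(K)]


def build_Hc(channels, modes):
    """Column for symbol (q, l) is vec(H_q A_{q,l}), stacked over K intervals."""
    cols = []
    for H, mode in zip(channels, modes):
        mats = A_STBC if mode == "STBC" else A_SM
        cols += [(H @ A).flatten(order="F") for A in mats]
    return np.column_stack(cols)


def is_design(B, tol=1e-9):
    """2x2 orthogonal design with constant diagonal: [[a, b], [-b, a]]."""
    return abs(B[0, 0] - B[1, 1]) < tol and abs(B[0, 1] + B[1, 0]) < tol


def all_blocks_are_designs(X):
    rows, cols = X.shape[0] // P, X.shape[1] // P
    return all(
        is_design(X[P * i:P * i + P, P * j:P * j + P])
        for i in range(rows) for j in range(cols)
    )


rng = np.random.default_rng(1)
modes = ["STBC", "SM", "STBC"]
channels = [rng.standard_normal((8, N)) for _ in modes]
H_c = build_Hc(channels, modes)
F = H_c.T @ H_c

print(all_blocks_are_designs(F))
```

Because F is symmetric, a diagonal block of the form [[a, b], [−b, a]] must have b = 0, which is why the diagonal blocks reduce to scalar multiples of I_(P).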

For complex-valued constellations, analogous results hold, with only possible modifications of the matrix dimensions. The results are summarized in Table 1, where A^((i,j)) denotes the (i,j)^(th) sub-matrix of A with the proper matrix dimension.

TABLE 1 Summary of the structure of the complex-valued matched-filtered cross-coupling matrix F_(p,q) (complex-valued constellation)

Case p, q ∈ S_(D):
- N = 2 (K = 2): p = q: F_(q,q) = α_(q)I₄; p ≠ q: F_(p,q) ∈ O(4).
- N = 3 or 4 (K = 8): p = q: F_(q,q) = α_(q)I₈; p ≠ q: F_(p,q) ∈ U(8), where U^((1,1))(8) = U^((2,2))(8) ∈ O(4) and U^((1,2))(8) = U^((2,1))(8) = O₄.

Case p, q ∈ S_(M):
- N = 2 (K = 2): p = q: F_(q,q) ∈ ℝ^(8×8), F_(q,q)^((n,n)) = α_(q,l)I₄, F_(q,q)^((n,d)) ∈ V(4), 1≦n, d≦2; p ≠ q: F_(p,q) ∈ ℝ^(8×8), F_(p,q)^((n,d)) ∈ V(4), 1≦n, d≦2; where V^((1,1))(4) = V^((2,2))(4) = c₁I₂ and V^((1,2))(4) = −V^((2,1))(4) = c₂I₂.
- N = 3 or 4 (K = 8): p = q: F_(q,q) ∈ ℝ^(16N×16N), F_(q,q)^((n,n)) = α_(q,l)I₁₆, F_(q,q)^((n,d)) ∈ V(16), 1≦n, d≦N; p ≠ q: F_(p,q) ∈ ℝ^(16N×16N), F_(p,q)^((n,d)) ∈ V(16), 1≦n, d≦N; where V^((1,1))(16) = V^((2,2))(16) = c₁I₈ and V^((1,2))(16) = −V^((2,1))(16) = c₂I₈.

Case p ∈ S_(D), q ∈ S_(M):
- N = 2 (K = 2): F_(p,q) ∈ ℝ^(4×8), F_(p,q)^((1,n)) ∈ O(4), 1≦n≦2.
- N = 3 or 4 (K = 8): F_(p,q) ∈ ℝ^(8×16N), F_(p,q)^((1,n)) ∈ D(8,16), 1≦n≦N, where D^((1,1))(8,16) = D^((1,2))(8,16) = D^((2,3))(8,16) = −D^((2,4))(8,16) ∈ O(4) and −D^((2,1))(8,16) = D^((2,2))(8,16) = D^((1,3))(8,16) = D^((1,4))(8,16) ∈ O(4).

III •Block-Wise SIC Symbol Detection

In order to separate the cross-coupled symbol data streams in equation (6), this invention adopts the SIC algorithm of REF. 1. When all users employ SM signaling (i.e., Q_(D)=0), it is natural to perform the conventional one-symbol-per-layer (symbol-wise) SIC detection of REF. 1. In this case, only L_(T)=NQ_(M) sub-data streams in z(k) need to be detected. When a few STBC users are present, leading to dual-mode signaling, the inherent time latency needed to exploit the diversity benefit for those STBC users means that the receiver collects more independent data symbols, and a total of L_(T)=P(Q_(D)+NQ_(M)) sub-data streams in z(k) need to be detected. The receiver thus faces large-dimension data processing and an unavoidable increase in detection complexity. However, by judiciously exploiting the algebraic structure of O-STBC, the corresponding SIC detector can be implemented in a block-wise manner. That is, in each SIC iteration, a block of P symbols, transmitted either from a particular STBC user or from one antenna of an SM user, can be “jointly” detected. As a result, only Q_(D)+NQ_(M) iterations are needed to detect all P(Q_(D)+NQ_(M)) transmitted symbols, which brings the algorithm complexity back down.

Zero-Forcing (ZF) Criterion: The ZF-based SIC detection algorithm is considered first, in which the optimum detection order at each iteration is found based on the maximum SNR criterion of REF. 1. At the initial stage, the ZF decision vector is obtained from (6) as: s_(d)(k):=F⁻¹z(k)=s_(c)(k)+F⁻¹v̄(k).  (8)

Equation (8) shows that, for 1≦l≦L_(T), the l^(th) symbol decision statistic, that is, the l^(th) element of s_(d)(k), is simply the desired symbol contaminated by the additive noise e_(l)^(T)F⁻¹v̄(k), where e_(l) is the l^(th) standard unit vector of ℝ^(L_(T)). It is straightforward to verify that the noise power is:

$\begin{matrix} {{E\left\{ \left| {e_{l}^{T}F^{- 1}{\bar{v}(k)}} \right|^{2} \right\}} = {\frac{\sigma_{\upsilon}^{2}}{2}e_{l}^{T}F^{- 1}{e_{l}.}}} & (9) \end{matrix}$
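The step behind (9) can be spelled out as follows. This short derivation assumes, as the factor σ_(υ)²/2 suggests, that the real components of v_(c)(k) are uncorrelated with variance σ_(υ)²/2 each, and it uses the symmetry of F⁻¹.

```latex
\begin{aligned}
E\{\bar{v}(k)\,\bar{v}^{T}(k)\}
  &= H_{c}^{T}\,E\{v_{c}(k)\,v_{c}^{T}(k)\}\,H_{c}
   = \frac{\sigma_{\upsilon}^{2}}{2}\,H_{c}^{T}H_{c}
   = \frac{\sigma_{\upsilon}^{2}}{2}\,F, \\
E\{|e_{l}^{T}F^{-1}\bar{v}(k)|^{2}\}
  &= e_{l}^{T}F^{-1}\,E\{\bar{v}(k)\,\bar{v}^{T}(k)\}\,F^{-1}e_{l}
   = \frac{\sigma_{\upsilon}^{2}}{2}\,e_{l}^{T}F^{-1}F\,F^{-1}e_{l}
   = \frac{\sigma_{\upsilon}^{2}}{2}\,e_{l}^{T}F^{-1}e_{l}.
\end{aligned}
```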

Because all transmitted symbols have the same variance, equation (9) means that the (average) SNR of the l^(th) decision channel is completely determined by [F⁻¹]_(l,l), the l^(th) diagonal element of the noise covariance matrix F⁻¹. A small [F⁻¹]_(l,l) implies a large SNR in the l^(th) decision channel, and hence better detection accuracy for the l^(th) symbol decision statistic. As a result, the optimum detection order at the initial stage is obtained by searching for the index 1≦l≦L_(T) at which [F⁻¹]_(l,l) is minimal. Determining the optimal index requires explicit knowledge of the diagonal elements of F⁻¹.

For a fixed parameter P, define ℱ(L) as the set of all invertible real symmetric PL×PL matrices X such that (i) each P×P diagonal block of X is a (non-zero) scalar multiple of I_(P), and (ii) each P×P off-diagonal block of X belongs to O(P). By the results of Section II, F∈ℱ(L) with L:=Q_(D)+NQ_(M). Denote by [F⁻¹]_(k,l) the (k,l)^(th) P×P block of F⁻¹, 1≦k, l≦L. It can then be further proved that [F⁻¹]_(l,l)=β_(0,l)I_(P) and [F⁻¹]_(k,l)∈O(P) when k≠l. This result shows that the P(Q_(D)+NQ_(M)) diagonal elements of F⁻¹ take at most Q_(D)+NQ_(M) different levels. A block of P symbols can thus be detected simultaneously at the initial stage, with the optimal detection order given by

$\overset{\_}{l_{0}} = {\arg\;{\min\limits_{l}{\beta_{0,l}.}}}$

In addition, the ZF weight matrix is formed from the correspondingly indexed columns of F⁻¹ as W₀=F⁻¹[e_(P(l̄₀−1)+1) . . . e_(P(l̄₀−1)+P)]∈ℝ^(L_(T)×P). The detected user's signal is then cancelled from the received signal (4), yielding a “modified” data model for detection at the next stage.

Through the detect-and-cancel process followed by the associated linear combining of the resultant data as in (6), it can be directly verified that, at the i^(th) iteration, 1≦i≦L−1, the noise covariance matrix can be written as: F_(i)⁻¹:=(H_(c,i)^(T)H_(c,i))⁻¹∈ℝ^((L_(T)−iP)×(L_(T)−iP)),  (10)

where H_(c,i) is obtained from H_(c) by removing the i block(s) of P columns corresponding to the previously detected signals. Since F_(i) is simply obtained from F by removing the corresponding i block(s) of P columns and rows, we have: F_(i)∈ℱ(L−i),  (11) and F_(i)⁻¹∈ℱ(L−i).  (12) Based on the foregoing discussion, block-wise detection can be carried out likewise at each iteration. The corresponding detection order and weight matrix are calculated in an analogous way as:

$\begin{matrix} {\;{{{\overset{\;\_}{l}}_{i} = {\arg{\;\;}{\min\limits_{1 \leq l \leq {L - i}}\beta_{i,l}}}},}} & (13) \\ {W_{i} = {{F_{i}^{- 1}\left\lbrack {e_{{{P{({{\overset{\_}{l}}_{i} - 1})}} + 1}\mspace{11mu}}\;\ldots\mspace{20mu} e_{{P{({{\overset{\_}{l}}_{i} - 1})}} + P}} \right\rbrack}.}} & (14) \end{matrix}$
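The block-wise ZF detect-and-cancel loop of (10) to (14) can be sketched as follows on a toy real-valued model with N = K = P = 2. This is a minimal sketch under stated assumptions: flat real channels replace the despread CDMA channel, symbols are BPSK so that the decision is a sign, and F_(i)⁻¹ is recomputed by direct inversion at every iteration rather than by the recursion of Section IV. The function names are hypothetical.

```python
import numpy as np

P = N = K = 2
A_STBC = [np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]])]
A_SM = [np.outer(np.eye(N)[n], np.eye(K)[t]) for n in range(N) for t in range(K)]


def build_Hc(channels, modes):
    """Toy equivalent channel: column for symbol (q, l) is vec(H_q A_{q,l})."""
    cols = []
    for H, mode in zip(channels, modes):
        mats = A_STBC if mode == "STBC" else A_SM
        cols += [(H @ A).flatten(order="F") for A in mats]
    return np.column_stack(cols)


def blockwise_zf_sic(H_c, y_c):
    """Detect one block of P BPSK symbols per iteration, best block first."""
    remaining = list(range(H_c.shape[1] // P))  # undetected block indices
    s_hat = np.zeros(H_c.shape[1])
    y, order = y_c.copy(), []
    while remaining:
        cols = [P * b + p for b in remaining for p in range(P)]
        H_i = H_c[:, cols]
        F_inv = np.linalg.inv(H_i.T @ H_i)       # eq. (10), direct inversion
        beta = np.diag(F_inv)[::P]               # one noise level per block
        j = int(np.argmin(beta))                 # eq. (13)
        W = F_inv[:, P * j:P * j + P]            # eq. (14)
        decision = np.sign(W.T @ (H_i.T @ y))    # BPSK slicer (assumption)
        block = remaining.pop(j)
        idx = slice(P * block, P * block + P)
        s_hat[idx] = decision
        y = y - H_c[:, idx] @ decision           # cancel the detected block
        order.append(block)
    return s_hat, order


rng = np.random.default_rng(2)
modes = ["STBC", "SM"]
channels = [rng.standard_normal((8, N)) for _ in modes]
H_c = build_Hc(channels, modes)
s = rng.choice([-1.0, 1.0], size=H_c.shape[1])
s_hat, order = blockwise_zf_sic(H_c, H_c @ s)    # noise-free check
print(order, np.array_equal(s_hat, s))
```

In this example six symbols are recovered in three iterations, one per block, instead of six symbol-wise iterations.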

It should be noted that the joint detection of a block of P symbols per iteration benefits uniquely from the use of orthogonal codes (O-STBC). However, it is also noted that, when the number of transmit antennas of the STBC users is greater than four, the appealing block detection property does not hold even if orthogonal codes are used. This is because F then loses the particular structure mentioned above and, as a consequence, the assertion on the inverse matrix F⁻¹ may not be true.

Minimum Mean Square Error (MMSE) Criterion: The MMSE-based SIC detector is also capable of jointly detecting a block of P symbols per iteration, in essentially the same block-wise manner as in the ZF case. At the initial stage, the MMSE weight matrix minimizing E{∥s_(c)(k)−W₀^(T)z(k)∥²} is obtained as:

$\begin{matrix} {W_{0} = {\left\lbrack {F + {\frac{\sigma_{\upsilon}^{2}}{2}I_{L_{T}}}} \right\rbrack^{- 1}.}} & (15) \end{matrix}$

The l^(th) symbol MSE, that is E{|e_(l) ^(T)[s_(c)(k)−W₀ ^(T)z(k)]|²}, is then computed as:

$\begin{matrix} {ɛ_{0,l} = {{e_{l}^{T}\left\lbrack {{\frac{2}{\sigma_{\upsilon}^{2}}F} + I_{L_{T}}} \right\rbrack}^{- 1}{e_{l}.}}} & (16) \end{matrix}$

Because F∈ℱ(L), it follows that R₀:=[(2/σ_(υ)²)F+I_(L_(T))]∈ℱ(L), and so is R₀⁻¹. Hence, block MMSE detection can be carried out at the initial stage. Starting from equation (4), and with the per-block detect-and-cancel process followed by the associated matched filtering as in (6), it can be checked that, at the i^(th) iteration, the symbol MSEs are the diagonal elements of R_(i)⁻¹:=[(2/σ_(υ)²)F_(i)+I_(L_(T)−iP)]⁻¹. Since R_(i)⁻¹∈ℱ(L−i), block MMSE detection is guaranteed at each iteration.
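The claim that the symbol MSEs in (16) are constant within each block of P symbols can be checked numerically. This is a minimal sketch on the same kind of toy real-valued model (N = K = P = 2, flat real channels); the noise variance `sigma2` and all helper names are assumptions.

```python
import numpy as np

P = N = K = 2
A_STBC = [np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]])]
A_SM = [np.outer(np.eye(N)[n], np.eye(K)[t]) for n in range(N) for t in range(K)]


def build_Hc(channels, modes):
    """Toy equivalent channel: column for symbol (q, l) is vec(H_q A_{q,l})."""
    cols = []
    for H, mode in zip(channels, modes):
        mats = A_STBC if mode == "STBC" else A_SM
        cols += [(H @ A).flatten(order="F") for A in mats]
    return np.column_stack(cols)


rng = np.random.default_rng(3)
modes = ["STBC", "SM", "STBC"]
H_c = build_Hc([rng.standard_normal((8, N)) for _ in modes], modes)
F = H_c.T @ H_c
I = np.eye(F.shape[0])

sigma2 = 0.5                                  # assumed noise variance
W0 = np.linalg.inv(F + (sigma2 / 2.0) * I)    # eq. (15)
R0_inv = np.linalg.inv((2.0 / sigma2) * F + I)
mse = np.diag(R0_inv)                         # eq. (16), one value per symbol

block_mse = mse.reshape(-1, P)                # rows = blocks of P symbols
best_block = int(np.argmin(block_mse[:, 0]))  # block to detect first
print(block_mse, best_block)
```

Each row of `block_mse` holds P equal values, so a single comparison per block suffices to pick the detection order, exactly as in the ZF case.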

From Table 1, it can be seen that F consists of orthogonal-type block sub-matrices; therefore, block-wise SIC detection for the complex-valued constellation case can be established similarly. By essentially the same arguments as in the real-symbol case, a block-based ZF/MMSE SIC detector can be derived in which, per iteration, 2P real symbols are detected for an STBC user and 2K real symbols for an antenna of an SM user.

IV •Low-Complexity Detector

The major computational load of the SIC algorithm is the successive matrix inversions throughout all iterations. This section shows how knowledge of the embedded structure of F and of its inverse matrix F⁻¹ can further reduce the computational load. Specifically, with the special structure of F, there is an efficient way of finding F⁻¹ by solving a set of linear equations of relatively small dimensions based on the Cholesky decomposition. Moreover, the inverse matrix required at each iteration can be calculated from the parameters available at the previous stage.

A. An Efficient Method for Calculating F⁻¹ Using the Cholesky Decomposition

Recall that every P×P block of F⁻¹ is (loosely stated) a P×P real orthogonal design. Each such sub-matrix is completely characterized by P independent variables. It thus suffices to determine, say, its last column; the remaining columns can be obtained through appropriate linear transformations. This a priori structural information shows that the matrix F⁻¹ is completely specified by its (jP)^(th) columns, for 1≦j≦L. Hence, the calculation of F⁻¹ amounts to solving the following linear equation of reduced dimensions: FG=E,  (17)

where G and E are L_(T)×L matrices whose j^(th) columns are the (jP)^(th) columns of F⁻¹ and I_(L_(T)), respectively. In solving (17) for the unknown G, the j^(th) column g_(j) must satisfy g_(i,j)=0 for (j−1)P+1≦i≦jP−1, because these P−1 consecutive zero entries come from the last column of the j^(th) P×P diagonal block of F⁻¹. Only the non-zero entries thus remain to be determined. The symmetry of F⁻¹ moreover limits the number of “actual” non-zero unknowns in each g_(j): only the entries lying below g_(jP−1,j)(=0) need to be computed. This analysis implies that, for the j^(th) column g_(j), only the last P(L−j)+1 entries need to be determined, and the number of unknowns decreases by P as the index j increases to j+1.

To see how the above structural information about G simplifies the solution of (17), first perform the Cholesky factorization F=LL^(T), where L is an L_(T)×L_(T) lower triangular matrix (whose P×P blocks likewise have the structure described for ℱ(L)); this triangular factor L is not to be confused with the block count L:=Q_(D)+NQ_(M). Then (17) can be equivalently rewritten as: LL^(T)g_(j)=e_(jP), 1≦j≦L.  (18)

Because L is a lower triangular matrix, a typical approach to solving (18) for g_(j) is forward and back substitution. Because the unknown elements to be determined in each g_(j) all lie below the entry g_(jP−1,j)(=0), the substitution procedure does not have to run through all the entries of g_(j). It simply terminates once g_(jP,j) is calculated; there is no need to compute the remaining “upper” entries, owing to the symmetry of F⁻¹.
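The reduced system (17) and (18) can be sketched as follows for P = 2. This is a minimal sketch under stated assumptions: the toy model uses flat real channels with N = K = P = 2, the general solver `np.linalg.solve` stands in for dedicated forward and back substitution (so the early termination described above is not exploited), and the rule for rebuilding a block from its last column is specific to the 2×2 design [[a, b], [−b, a]]. The function names are hypothetical.

```python
import numpy as np

P = N = K = 2
A_STBC = [np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]])]
A_SM = [np.outer(np.eye(N)[n], np.eye(K)[t]) for n in range(N) for t in range(K)]


def build_Hc(channels, modes):
    """Toy equivalent channel: column for symbol (q, l) is vec(H_q A_{q,l})."""
    cols = []
    for H, mode in zip(channels, modes):
        mats = A_STBC if mode == "STBC" else A_SM
        cols += [(H @ A).flatten(order="F") for A in mats]
    return np.column_stack(cols)


def reduced_inverse_columns(F):
    """Solve F G = E for only the (jP)-th columns of F^{-1}, as in (17)."""
    C = np.linalg.cholesky(F)             # F = C C^T, C lower triangular
    E = np.eye(F.shape[0])[:, P - 1::P]   # (jP)-th columns of the identity
    U = np.linalg.solve(C, E)             # forward-substitution step
    return np.linalg.solve(C.T, U)        # back-substitution step


def rebuild_inverse(G):
    """P = 2 only: block [[a, b], [-b, a]] has last column [b, a]^T."""
    L_T, L = G.shape
    F_inv = np.zeros((L_T, L_T))
    for j in range(L):
        for i in range(L):
            b, a = G[P * i:P * i + P, j]
            F_inv[P * i:P * i + P, P * j:P * j + P] = [[a, b], [-b, a]]
    return F_inv


rng = np.random.default_rng(4)
modes = ["STBC", "SM"]
H_c = build_Hc([rng.standard_normal((8, N)) for _ in modes], modes)
F = H_c.T @ H_c
G = reduced_inverse_columns(F)
F_inv = rebuild_inverse(G)
print(np.allclose(F_inv, np.linalg.inv(F)))
```

Only L columns are solved for instead of L_(T) = PL; a production implementation would likely use a triangular solver and stop each substitution early as the text describes.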

B. The Method of Recursive Calculation of F⁻¹

As mentioned above, at the i^(th) iteration the optimal detection order and the associated ZF weight matrix must be determined. With F⁻¹ obtained, the following shows how F_(i)⁻¹ at each iteration can be recursively computed based on F_(i-1) and F_(i-1)⁻¹.

From the construction of H_(c,i), it is observed that the matrix F_(i)=H_(c,i)^(T)H_(c,i) is simply obtained from F_(i-1)(=H_(c,i-1)^(T)H_(c,i-1)) by deleting one block of P columns and the correspondingly indexed block of P rows. Without loss of generality, it is assumed that the last column and row blocks of F_(i-1) are to be deleted. Otherwise, the blocks to be discarded can simply be permuted to the right and bottom ends of F_(i-1) to fit the prescribed form. F_(i-1) can then be partitioned as:

$\begin{matrix} {{F_{i - 1}\text{:}{= \begin{bmatrix} F_{i} & B_{i - 1} \\ B_{i - 1}^{T} & D_{i - 1} \end{bmatrix}}},} & (19) \end{matrix}$

where B_(i-1) is an (L_(T)−iP)×P matrix and D_(i-1)=d_(i-1)I_(P) for some scalar d_(i-1). Denote by F̄_(i-1) the leading (L_(T)−iP)×(L_(T)−iP) principal sub-matrix of F_(i-1)⁻¹; this matrix is available from the (i−1)^(th) iteration. Based on F_(i-1) and F̄_(i-1), it can be shown after some manipulations that the following key result holds:

$\begin{matrix} {{F_{i}^{- 1} = {{\bar{F}}_{i - 1} - {c_{i - 1}^{- 1}{\bar{F}}_{i - 1}B_{i - 1}B_{i - 1}^{T}{\bar{F}}_{i - 1}}}},} & (20) \end{matrix}$

where the scalar c_(i-1) is defined by B_(i-1)^(T)F̄_(i-1)B_(i-1)+d_(i-1)I_(P)=c_(i-1)I_(P).

Equation (20) thus provides a simple recursive formula for calculating F_(i)⁻¹ from sub-blocks of F_(i-1) and F_(i-1)⁻¹, without any “direct” matrix inversion operations. The overall low-complexity computation of F_(i)⁻¹, for 1≦i≦L−1, is illustrated in FIG. 3. The above recursive approach for calculating F_(i)⁻¹ can basically be regarded as a block-based implementation of the method in [“A fast recursive algorithm for optimum sequential signal detection in a BLAST system,” IEEE Trans. Signal Processing, vol. 51, no. 7, pp. 1722-1730, July 2003], which was introduced for the conventional symbol-wise SIC algorithm. A distinctive feature of our scheme, however, is the attractive simplification in (20), by which the computation of the inverse matrix (B_(i-1)^(T)F̄_(i-1)B_(i-1)+d_(i-1)I_(P))⁻¹ is completely avoided, and all we have to do is find the scalar c_(i-1)⁻¹. Note that obtaining F_(i)⁻¹ through (20) still requires some matrix operations, which may be computationally demanding. However, based on the rich and distinctive structure of F, it can be shown that the computational complexity can be further reduced.
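One step of the recursion (20) can be checked numerically with the sketch below. It is an illustration under stated assumptions rather than the implementation of FIG. 3: the helper names and the 4×4 example with P=2 are hypothetical, the block to be deleted is assumed to sit last, and exact rational arithmetic (`fractions.Fraction`) is used only so that the result can be compared without rounding.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Plain-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def downdate_inverse(F_prev, Finv_prev, P):
    """Return F_i^{-1} from F_{i-1} and F_{i-1}^{-1} via (20); last P rows/cols are dropped."""
    n = len(F_prev) - P
    Fbar = [row[:n] for row in Finv_prev[:n]]   # leading principal sub-matrix of F_{i-1}^{-1}
    B = [row[n:] for row in F_prev[:n]]         # off-diagonal block B_{i-1}
    d = F_prev[n][n]                            # D_{i-1} = d * I_P
    FB = matmul(Fbar, B)                        # Fbar * B, size n x P
    Bt = [list(col) for col in zip(*B)]         # B^T, size P x n
    M = matmul(Bt, FB)                          # B^T * Fbar * B, size P x P
    # (20) relies on M being a scalar multiple of I_P, as the text states.
    if any(M[r][s] != (M[0][0] if r == s else 0) for r in range(P) for s in range(P)):
        raise ValueError("B^T Fbar B is not a scalar multiple of the identity")
    c = M[0][0] + d
    # Fbar is symmetric, so Fbar B B^T Fbar = (Fbar B)(Fbar B)^T.
    return [[Fbar[r][s] - sum(FB[r][k] * FB[s][k] for k in range(P)) / c
             for s in range(n)] for r in range(n)]

# Hypothetical F_{i-1} with block-orthogonal structure and its exact inverse.
F_prev = [[Fr(x) for x in row] for row in
          [[4, 0, 1, 2], [0, 4, -2, 1], [1, -2, 5, 0], [2, 1, 0, 5]]]
Finv_prev = [[Fr(1, 3), 0, Fr(-1, 15), Fr(-2, 15)],
             [0, Fr(1, 3), Fr(2, 15), Fr(-1, 15)],
             [Fr(-1, 15), Fr(2, 15), Fr(4, 15), 0],
             [Fr(-2, 15), Fr(-1, 15), 0, Fr(4, 15)]]

Finv_i = downdate_inverse(F_prev, Finv_prev, 2)
print(Finv_i)
```

In this example F_(i) is the leading 2×2 block 4·I_(2), so the recursion should return (1/4)·I_(2); only the scalar c_(i-1) is inverted along the way.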

V •Two-Stage Block-Wise SIC Detection

As mentioned above, when a few STBC users are present, the inherent time latency of those STBC users means that the receiver could suffer from large-dimension data processing under the conventional symbol-wise SIC algorithm [“A fast recursive algorithm for optimum sequential signal detection in a BLAST system”, IEEE Trans. Signal Processing, vol. 51, no. 7, pp. 1722-1730, July 2003]. To remedy this, a “two-stage group SIC” detector is proposed: it first detects the group of STBC streams with the block-wise SIC algorithm described above, since these streams tend to be relatively robust to channel conditions. Once the detected STBC streams have been removed from the data y_(c)(k), the SIC algorithm reverts to the conventional symbol-wise realization to recover the group of remaining SM streams. In this case, however, the detection order of the two-stage group SIC algorithm may not be optimal, which can lead to a performance loss. It is also noted that, even under the optimal ordering, the symbol-wise SIC algorithm can in fact be applied once all the STBC streams have been detected.

VI •Computer Simulation Results

To assess the performance of the dual-signaling scheme, we consider a four-user cellular system specified as follows: 1) two transmit antennas at each handset, 2) two receive antennas at the BS, and 3) a processing gain of 16. All four access channels are assumed to have a delay spread of five chips; two of them are spatially correlated, with the direct path following the Ricean model with the same κ-factor κ=10, and the other two are independent Rayleigh fading channels. At the BS, the SIC detector with the minimum mean square error (MMSE) criterion is used for signal recovery. The mean BER, averaged over all the detected streams, is used as the overall cell performance measure. FIG. 4 shows the performance of the “all SM” signaling and the dual-signaling (Alamouti's code for diversity) strategies. The symbol constellations used are QPSK for SM and 16-QAM for diversity, respectively, so that the data rates are held constant. As one can see, the dual-mode signaling, in particular the adoption of transmit diversity over correlated channels, does attain higher cell throughput.
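The constant-rate statement can be checked with a short calculation. It assumes the two-antenna Alamouti code, which sends two symbols over two symbol periods (rate one), and counts uncoded bits per user per symbol period:

```latex
\begin{align*}
\text{SM, QPSK:}\quad & 2\ \text{antennas} \times \log_2 4\ \text{bits} = 4\ \text{bits per symbol period},\\
\text{Alamouti, 16-QAM:}\quad & \frac{2\ \text{symbols}}{2\ \text{symbol periods}} \times \log_2 16\ \text{bits} = 4\ \text{bits per symbol period}.
\end{align*}
```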

The performance of the proposed block SIC detector is compared with two existing interference cancellation schemes introduced for multiuser space-time coded wireless systems, namely Naguib's 2-step approach [REF.3] and Stamouli's method [REF.5]. For the previously considered four-user platform, FIG. 5 shows the average BER of the three detection methods. As we can see, the SIC-based solution and Naguib's method, which is basically a PIC scheme combined with an ML-based sorting mechanism, attain roughly the same performance. The ML ordering search of Naguib's method, however, suffers from a large computational load, especially when the number of users or the symbol constellation size is large. Stamouli's method, which relies on a linear transformation to decouple one user's signal at a time from the data for detection, incurs a performance loss. This is because, in the proposed SIC solution, the nulling-and-cancellation procedures can increase the receive diversity order layer after layer, whereas Stamouli's decoupling-based approach merely retains an identical diversity order for each layer.

1. A method of symbol detection for a uplink multi-input multi-output (MIMO) CDMA system that adopts dual-signaling at a transmission end to flexibly switch between transmission diversity and spatial multiplexing based on space-time channel conditions, comprising: based on the channel characteristics, data stream of each user can be simultaneously applied with orthogonal space-time block encoding (O-STBC) for transmission diversity and/or spatial multiplexing (V-BLAST) for high spectral efficiency; wherein a successive interference cancellation (SIC) algorithm detects data streams by adopting an algebraic structure of the orthogonal space-time block encoding; and the SIC algorithm can be implemented in block-wise, that the data of an STBC terminal or an antenna of an SM terminal can be detected simultaneously at each iteration; and wherein the symbol detection uses a matched filtered channel matrix with a block orthogonal structure to develop a low complexity recursion-based detector; and during the SIC detection, a weight matrix which is required at each iteration is directly obtained from result of a previous iteration, without additional matrix inversion.
 2. A method of symbol detection for a uplink multi-input multi-output (MIMO) CDMA system that adopts dual-signaling at a transmission end to flexibly switch between transmission diversity and spatial multiplexing based on space-time channel conditions, comprising: based on the channel characteristics, data stream of each user can be simultaneously applied with orthogonal space-time block encoding (O-STBC) for transmission diversity and/or spatial multiplexing (V-BLAST) for high spectral efficiency; wherein a successive interference cancellation (SIC) algorithm detects data streams by adopting an algebraic structure of the orthogonal space-time block encoding; and the SIC algorithm can be implemented in block-wise, that the data of an STBC terminal or an antenna of an SM terminal can be detected simultaneously at each iteration; and wherein the symbol detection, to solve a time delay problem caused by space-time block encoding (STBC) signaling, employs a 2-stage group SIC detection algorithm to first detect STBC data streams in block-wise and then detect remaining SM data streams by a conventional symbol-wise SIC algorithm. 